ABSTRACT


The  stepwise  multiple  regression  technique  is  used  in  a  model  building 
process  to  develop  predictors  of  temperature,  salinity,  and  sound  velocity  as 
functions  of  geographical  location,  time,  and  depth.  Models  which  give 
reasonable  results  are  obtained  through  successive  trials  using  higher  order 
terms of the independent variables. The model for sound velocity yields values which are nearly identical both to the Wilson sound velocities contained in the ocean station file and to values computed using a modified version of the Mackenzie equation.

The distribution of residuals resulting from a comparison of the Wilson equation sound velocities with those obtained from the regression model (both computed from observed temperatures and salinities) shows that 98% fall within ±2 m/sec. When the regression model sound velocities are instead computed from regression predictions of temperature and salinity, a comparison with the Wilson values shows that 88% of the residuals fall within ±12 m/sec.

The results, which are valid for the 4° square centered at 37.5° North latitude and 69.5° West longitude, are discussed in terms of the statistical significance of the distribution of the residuals. Since the technique gives reasonable results in an area whose physical characteristics are rather complex, its application to other parts of the ocean is recommended.
I. INTRODUCTION

In the field of oceanography, there is a real need for accurate numerical procedures for determining sound velocity in sea water based on certain ocean variables such as latitude, longitude, temperature, salinity, and day-of-year.

The technique of stepwise multiple regression readily lends itself as a tool of analysis in the development of polynomial prediction equations of the form

$$y = \sum_{i=0}^{n} \beta_i x_i, \qquad x_i = f(z_1, z_2, \ldots, z_n),$$

where the $\beta_i$ are the coefficients to be determined and the $z_j$ are the independent variables in the model. Nevertheless, no reliable numerical method exists which eliminates the need for on-location measurements of certain variables such as temperature, salinity, and pressure. Once the values of these variables are known, however, one may use any of a number of well-known, reliable equations for computing sound velocity. Two such equations utilized in this study are those of H. V. Mackenzie and Wayne D. Wilson.

It  is  the  purpose  of  this  study  to  adequately  predict 
sound  velocity  at  any  location  within  a  given  range  of 
latitude  and  longitude  without  going  to  that  particular 
location  to  measure  variables  such  as  temperature  and 
salinity. To do this, however, temperature and salinity must themselves be predicted to a certain degree of accuracy. This will involve examining a number of classes of models.

The problem of developing prediction equations for temperature, salinity, and sound velocity is further complicated by other factors, most of which are uncontrollable. Some of these factors are time series autocorrelation in the data, errors due to instrumentation, missing data, land masses, underwater streams or currents, temperature inversions, and sparse data, to mention a few. Each of these factors has its own effect on the generalized regression development. The effects of some of them will be discussed in the following chapter. It is hoped, of course, that errors due to these factors will occur randomly.

When dealing with oceanographic problems, the handling of data becomes an obstacle. While the data for a given square (x° by x°) are relatively sparse in space and time, the total number of observations for the square is still very large. Consequently, most of the conclusions of this study are based on data from the 4° by 4° square 36° - 40° N latitude and 68° - 72° W longitude. The sign convention used is as follows: north latitude is positive; west longitude is negative.

The  execution  time  for  the  stepwise  multiple  regression 
procedure  when  a  large  model  is  under  consideration  is  quite 
long.  For  purposes  of  economy  the  goal  is  to  determine  a 
model  with  as  few  terms  as  possible  which  does  an  adequate 
job  of  predicting.  This  requires  extensive  trial  and  error 
model  refinement. 
Stringent accuracy requirements must be met to qualify the regression analysis as an acceptable subsystem of the more extensive, overall "Ocean Station Display System" and the "quick look" facility utilizing a cathode ray tube, now under development by Mr. Richard Bolton. The regression equations, which are surfaces when plotted, can be instrumental in the display of temperature, salinity, and sound velocity contours in the graphic display system.

Initially, a simple model will be considered at each of the depth planes in the 4° by 4° square 36° - 40° N latitude and 68° - 72° W longitude. This will yield, for each depth plane, a set of regression equations for temperature, salinity, and sound velocity consisting of the terms not rejected by the predetermined accuracy criterion.

A more general regression situation is then considered, where an equation is developed using depth as one of the independent variables. This results in the development of one equation for each of the dependent variables temperature, salinity, and sound velocity, which is general for all depth planes.

Many general regression models involving as many as six independent variables with up to tenth order cross products were tried. The process of developing the models involved trial and error addition and deletion of cross products of the various independent variables. Several interesting combinations were tried, and the models which produced the best results are discussed in the latter part of Chapter III.


The primary objective of this study is to develop equations which may be used to predict temperature and salinity using only controllable variables which may be set by the user. Once these values are known, they may be used in some sound velocity equation such as Wilson's, Mackenzie's, or the regression sound velocity equation.

Once the temperature and salinity equations are developed, five sound velocity values are available for each observation card. Given latitude, longitude, and depth, a temperature and a salinity may be calculated from the respective regression equations. This allows calculation of sound velocity from Mackenzie's equation and from the regression sound velocity equation using the predicted temperature and salinity. Two more sound velocities may be obtained for the same observation by evaluating these two equations using the observed temperature and salinity rather than the predicted values. The fifth value is Wilson's sound velocity for the same data; the four computed values are compared with it to determine the adequacy of the regression equations.

The  comparisons  made  are  as  follows:  Wilson's  - 
Mackenzie's,  Wilson's  -  regression  sound  velocity,  and 
Mackenzie's  -  regression  sound  velocity,  using  the  observed 
temperature  and  salinity.  The  same  comparisons  are  made 
using  the  predicted  temperature  and  salinity. 
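The comparisons listed above all reduce to forming residuals between two sets of sound velocities and summarizing how they are distributed. The following is a minimal sketch of that bookkeeping, not the procedure actually coded for the study; the function names and the sample values are hypothetical and serve only as illustration.

```python
def residuals(reference, candidate):
    """Residuals r_i = reference_i - candidate_i for paired observations."""
    return [r - c for r, c in zip(reference, candidate)]


def fraction_within(residual_values, band):
    """Share of residuals whose magnitude does not exceed the band (m/sec)."""
    inside = sum(1 for r in residual_values if abs(r) <= band)
    return inside / len(residual_values)


# Illustrative values only; they are not taken from the ocean station file.
wilson = [1500.2, 1498.7, 1490.1, 1487.9]
regression = [1499.9, 1499.5, 1492.6, 1487.8]

r = residuals(wilson, regression)
share = fraction_within(r, 2.0)  # share of residuals within +/- 2 m/sec
```

The same two functions can be applied to each pairing (Wilson's against Mackenzie's, Wilson's against regression, Mackenzie's against regression), once with observed and once with predicted temperature and salinity.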

A comparison is made between the general regression equation, where depth is an independent variable, and the case where equations for temperature, salinity, and sound velocity are built at each depth plane.

The  reliability  of  the  Mackenzie  and  Wilson  equations 
will  be  discussed  in  Chapter  II  and  the  modification  to 
Mackenzie's  equation  needed  to  obtain  agreement  with 
Wilson’s  equation  will  be  discussed  in  Chapter  III. 

Data were made available on punched cards by Mr. Richard Bolton, using programs to decode the "Rapid Access Tape Format Oceanographic Station Data" system developed and provided by Mr. Walter E. Yergen.

The  cards  consist  of  3720  observations  for  latitude, 
longitude,  depth,  temperature,  salinity,  day-of-year,  and 
Wilson’s  sound  velocity  value  computed  from  these  variables 
using  a  procedure  described  in  Chapter  II. 

In order to develop more meaningful models, a decision was made to investigate a 4° by 4° square in the North Atlantic Ocean rather than several 2° by 2° squares in the same area. It was felt that if adequate prediction equations could be built for this area, then certainly the same equations would be adequate for each of the four 2° by 2° squares contained in the 4° by 4° square. Since excellent prediction equations were obtained for the 4° by 4° square, 36° - 40° north latitude and 68° - 72° west longitude, the remainder of the study was devoted to investigating 2° by 2° squares around this 4° by 4° square.
II. REVIEW OF LITERATURE


Many sound velocity tables have been developed for both distilled water and sea water. N. H. Heck and J. H. Service published a set of tables in 1924, which were based on a systematic calculation scheme. In 1927, D. J. Matthews published a table of sound velocity calculations for distilled water and sea water. In 1939, Matthews published a revised edition of his tables after the improved set of tables of Kuwahara was introduced in 1938. The revised edition of Matthews was in close agreement with Kuwahara, but the Kuwahara tables are considered to be the better of the two.

The Kuwahara tables motivated the development, by several individuals and organizations, of equations to represent these data. Three of the better known and more reliable equations developed to represent the Kuwahara tables are those of H. V. Mackenzie, Wayne D. Wilson, and V. A. Del Grosso. The Mackenzie and Wilson equations will be discussed in some detail, since they are used to substantiate the results of this study. Results of Del Grosso's study are used in the modification of Mackenzie's equations to reduce residuals at upper depths.

The basic Mackenzie equation, of the form

$$V_{TSD} = V_{0,35,0} + \Delta V_T + \Delta V_S + \Delta V_D + \Delta V_\phi + \Delta V_{TSD} \qquad (1)$$

is readily seen to be a function of temperature (T), salinity (S), depth (D), and latitude ($\phi$, absolute value of).
The equation consists of six parts:

1. Reference velocity ($V_{0,35,0}$), computed at 0°C, 35‰ salinity, and zero depth.

2. Temperature dependence ($\Delta V_T$).

3. Salinity dependence ($\Delta V_S$).

4. Depth dependence ($\Delta V_D$).

5. Latitude dependence ($\Delta V_\phi$).

6. Interaction dependence due to simultaneous change of T, S, and D ($\Delta V_{TSD}$). $\Delta V_{TSD}$ is broken into three parts, $\Delta V_{TS}$, $\Delta V_{SD}$, and $\Delta V_{TD}$, for further analysis, where

a. $\Delta V_{TS}$ = temperature-salinity interaction

b. $\Delta V_{SD}$ = salinity-depth interaction

c. $\Delta V_{TD}$ = temperature-depth interaction

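where (exponents and expressions marked ? are not legible in the source)

1. $V_{0,35,0} = 1445.5$ m/s  (2)

2. $\Delta V_T = 4.6374\,T - 5.383 \times 10^{-2}\,T^2 + 2.543 \times 10^{-4}\,T^3$  (3)

3. $\Delta V_S = 1.307\,(S - 35)$  (4)

4. $\Delta V_D = 1.815 \times 10^{-2}\,D - 5.291 \times 10^{-?}\,D^2$  (5)

5. $\Delta V_\phi = 1.5 \times 10^{-?}\,D(\phi - 35) + 0.94 \times 10^{-?}\,D(\phi - 35)^{?} - 2.94 \times 10^{-?}\,D^2(\phi - 35)^{?} - 1.214 \times 10^{-?}\,(\phi - 35)$  (6)

6. $\Delta V_{TSD} = \Delta V_{TS} + \Delta V_{SD} + \Delta V_{TD}$  (7)

where

a. $\Delta V_{TS}$ = ?

b. $\Delta V_{SD}$ = ?

c. $\Delta V_{TD} = \ldots + T(6.95 \times 10^{-?}\,D - 5.27 \times 10^{-?}\,D^2 + 2.7 \times 10^{-?}\,D^3)$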
Summing the results (2) through (7) gives the result (1).

The Mackenzie equation agrees with the Kuwahara tables to within 0.1 m/sec everywhere, but it should be noted that the equations were developed to fit this particular data set. This is not to say that the equations will not be useful in data reduction for other areas, but one should not be surprised to find larger residuals between Mackenzie's values and actual readings, or between Mackenzie's values and Wilson's values, for the same data.

Mackenzie's equations are flexible, and provision is made for modification if necessary. There is evidence in the analysis suggesting that the depth dependency factor and/or the latitude dependency factor need modification. Experimentation with this problem will be discussed in the following chapter.

The formulation of Wilson's equation displays the same basic form as Mackenzie's equation; that is,

$$V = 1449.22 + \Delta V_T + \Delta V_P + \Delta V_S + \Delta V_{STP} \qquad (8)$$

The main difference is that V is a function of temperature, salinity, and pressure, where pressure is a function of depth. The equations were developed in a controlled laboratory environment. The development was restricted by the assumption that 99.5% of all sea water falls in the ranges of -3°C < T < 30°C for temperature, 1.033 kg/cm² < P < 1000.0 kg/cm² for pressure, and 33‰ < S < 37‰ for salinity.

The equations were developed from 581 laboratory-measured sound speeds for fifteen temperatures, eight pressures, and five salinities. The method of least squares was applied, using a 20x20 matrix to arrive at the coefficients.
ing depth into incremental layers and summing the product of average density in each layer times the thickness of the layer. This is expressed as $P = \sum g_\theta\, \bar{\rho}\, t$, where $g_\theta$ is the acceleration due to gravity at latitude $\theta$ and at the mean depth of the layer, $\bar{\rho}$ is the average density of the layer, and t is the thickness of the layer.

A more complete approach for determining pressure at depth is outlined by Walter Yergen. The development is based on the assumption that, as initial conditions, the surface pressure is equal to the mean standard atmospheric pressure of 10.1325 decibars and that the initial gravitational attraction $g_0$ may be computed as a function of latitude ($\theta$) according to

$$g_0 = 0.980616 - 2.5928 \times 10^{-3} \cos(2\theta) + 6.9 \times 10^{-6} \cos^2(2\theta) \qquad (14)$$

and that the change in g between depths is given by

$$g_i = g_{i-1} + 1.101 \times 10^{-7}\,(D_i - D_{i-1}). \qquad (15)$$

Since pressure is a function of density, and density is not explicitly given, an approximated density at $D_i$ is attained by successive iterations $\rho_{i,1}, \rho_{i,2}, \ldots, \rho_{i,n}$. In theory the iteration should stop when the difference $|\rho_{i,j+1} - \rho_{i,j}| < \epsilon$, where $\epsilon$ is some predefined tolerance. Then $\rho_i$ for $D_i$ is taken to be $\rho_{i,j+1}$.

The determination of pressure ($P_i$) at depth $D_i$ is an iterative procedure of successive alternating approximations between pressure (P) and density ($\rho$) in the sequence

$$\rho_{i,1},\ P_{i,1},\ \rho_{i,2},\ P_{i,2},\ \ldots,\ \rho_{i,n},\ P_{i,n}.$$
This requires that an initial density be known, and an initial pressure, which is assumed to be 10.1325 decibars. The starting approximation to the true pressure at $D_i$ is taken to be the pressure at the previous depth, $P_{i-1}$.

Once the first density approximation, $\rho_{i,1}$, is computed, equation (16) is used to compute the first approximation to the true pressure,

$$P_{i,k} = P_{i-1} + g_i\, \rho_{i,k}\, (D_i - D_{i-1}) \qquad (16)$$

where $g_i$ is obtained from (15).

Now that $P_{i,1}$ is known, a second approximation to the true density, $\rho_{i,2}$, is computed. For $\rho_{i,k}$ at depth $D_i$, where $k \geq 2$, the following expression is used,

$$\rho_{i,k} = (1 + 10^{-3}\sigma_t)/R \qquad (17)$$

where $\sigma_t$ and R are functions of temperature, salinity, and the previously computed $P_{i,k-1}$, according to the following relations


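$$
\begin{aligned}
10^{-3}\sigma_t = {} & (3.118633 \times 10^{-6} + 4.5317157 \times 10^{-3}\,T - 5.4593903 \times 10^{-4}\,T^2 - 1.4385354 \times 10^{-10}\,T^4)/(67.26 + T) \\
& + \sigma_0\,(0.001 - 4.7867 \times 10^{-6}\,T + 9.8185 \times 10^{-8}\,T^2 - 1.0843 \times 10^{-9}\,T^3) \\
& + \sigma_0^2\,(1.803 \times 10^{-8}\,T - 8.164 \times 10^{-10}\,T^2 + 1.667 \times 10^{-11}\,T^3)
\end{aligned}
\qquad (18)
$$

where

$$\sigma_0 = -9.3445863 \times 10^{-2} + 0.81487658\,S - 4.8249614 \times 10^{-4}\,S^2 + 6.7678614 \times 10^{-6}\,S^3 \qquad (19)$$

and

$$
\begin{aligned}
R = {} & 1 - [4.886 \times 10^{-7}\,P/(1 + 1.83 \times 10^{-5}\,P)] \\
& + P\,[-2.2072 \times 10^{-8} + 3.673 \times 10^{-9}\,T - 6.63 \times 10^{-11}\,T^2 + 4 \times 10^{-13}\,T^3 \\
& \qquad + \sigma_0\,(1.725 \times 10^{-9} - 3.28 \times 10^{-11}\,T + 4 \times 10^{-13}\,T^2) + \sigma_0^2\,(-4.5 \times 10^{-12} + 10^{-13}\,T)] \\
& + P^2\,[-6.68 \times 10^{-16} - 1.24064 \times 10^{-14}\,T + 2.14 \times 10^{-16}\,T^2 \\
& \qquad + \sigma_0\,(-4.248 \times 10^{-15} + 1.206 \times 10^{-16}\,T - 2 \times 10^{-18}\,T^2) + \sigma_0^2\,(1.8 \times 10^{-17} - 6 \times 10^{-19}\,T)] \\
& + P^3\,(1.5 \times 10^{-20}\,T)
\end{aligned}
\qquad (20)
$$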

where, for the second approximation to density $\rho_{i,2}$, $P = P_{i,1}$. Now find $P_{i,2}$ by using (16) with k = 2. This back-and-forth iteration is continued until $|P_{i,k} - P_{i,k-1}| < \epsilon$. The data, however, are somewhat inaccurate and warrant no more than three iterations as a best approximation to the true pressure at $D_i$. Hence $P_{i,3}$ is used in (10), (11), (12), and (13) for finding sound velocity. Note that since the above pressure is in decibars, the conversion $P = 0.101971\,P_{i,3}$ must be made before use in Wilson's equation. If the velocity is desired in feet per second, $V_{\text{feet/sec}} = V_{\text{meters/sec}}\,(3.28083)$ yields the desired result.
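The alternating density and pressure procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the program used in the study: the density is supplied by a caller-provided function (`density_at`, a hypothetical name) rather than by relations (17) through (20), the powers of ten in the gravity formulas are assumed values because the exponents are not legible in this copy, and the toy density in the example is illustrative only.

```python
import math

SURFACE_PRESSURE_DBAR = 10.1325  # mean standard atmosphere, as stated in the text


def surface_gravity(latitude_deg):
    """Equation (14), in the scaled units of the text (powers of ten assumed)."""
    c = math.cos(2.0 * math.radians(latitude_deg))
    return 0.980616 - 2.5928e-3 * c + 6.9e-6 * c * c


def pressure_profile(depths_m, density_at, latitude_deg, iterations=3):
    """Alternate density and pressure estimates, one depth at a time.

    depths_m   -- increasing depths in meters, starting at the surface
    density_at -- hypothetical callable (depth_index, pressure_dbar) -> density
    Returns the pressures in decibars, one per depth.
    """
    pressures = [SURFACE_PRESSURE_DBAR]
    g = surface_gravity(latitude_deg)
    for i in range(1, len(depths_m)):
        thickness = depths_m[i] - depths_m[i - 1]
        g = g + 1.101e-7 * thickness      # equation (15), exponent assumed
        p = pressures[-1]                 # starting guess: previous pressure
        for _ in range(iterations):       # the text settles on three passes
            rho = density_at(i, p)        # density from the current pressure
            p = pressures[-1] + g * rho * thickness  # equation (16)
        pressures.append(p)
    return pressures


def dbar_to_kg_per_cm2(pressure_dbar):
    """Conversion required before the pressure is used in Wilson's equation."""
    return 0.101971 * pressure_dbar


def toy_density(index, pressure_dbar):
    """Illustrative stand-in for relations (17)-(20); not a real equation of state."""
    return 1.028 + 4.0e-7 * pressure_dbar


profile = pressure_profile([0.0, 10.0, 100.0], toy_density, 38.0)
```

Passing the density as a function keeps the iteration itself separate from the equation of state, so the same loop can be run with any density relation.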

Wilson and Del Grosso concluded from careful laboratory measurements that the reference velocity, $V_{0,35,0}$, used by Kuwahara in constructing the Kuwahara tables is low by about 3 m/sec, particularly at upper depths where pressure is lower. A comparison of Wilson's predicted values and the values predicted by Kuwahara substantiates this 3 m/sec differential from 0 to 100 kg/cm² pressure. The reference velocity in Mackenzie's equation (1) will also be low by 3 m/sec, since the equation was constructed to fit the Kuwahara tables. The 3 m/sec differential in Kuwahara's values at atmospheric pressure is concluded to be a result of slightly erroneous data on the compressibility of water.

Comparing the results of Mackenzie's and Wilson's equations when applied to oceanographic data other than that for which the equations were developed substantiates the 3 m/sec difference at near atmospheric pressures and below. Wilson's equation predicted values almost consistently 3 m/sec higher than Mackenzie's for this data, particularly for depths to 500 meters. Beyond this depth there is, roughly, a linear decrease in the difference between the sound velocities predicted by the two equations for the same temperature, salinity, and depth observation. At 2000 meters the two equations are in excellent agreement. Figure 1 shows, roughly, the plot of residuals for depths from 0 to 2500 meters.


[Figure 1. Plot of the residuals between Wilson's values and Mackenzie's values for the same data ($r_i = w_i - m_i$). Axis label: differences (m/s).]

The values yielded by Wilson's equations are used extensively in checking the results of this study, since these values are considered to be good for most applications in the physical sciences. Mackenzie's equation, however, is easier to use since there is no pressure dependency term. The modification to Mackenzie's equation, to be discussed in the following chapter, is warranted on the basis of its ease of use and by the fact that, for the areas considered, the differences between the modified equation and Wilson's values were no greater in absolute value than 0.9 meters/second for all depths. Both equations are used, however, in checking the results of the regression equations developed in this investigation.

Data deficiencies are always an impediment in the solving of oceanographic problems. C. J. Van Vliet has made a rather extensive empirical study of the effect of random and nonrandom missing data on regression and autocorrelation analyses of time series data. The purpose of the time series analysis is to isolate trend, a gradual increase or decrease in a system over a long period of time; oscillation, a variation about the trend which occurs with a pattern of regularity over a period of time; and random elements, unpredictable variations in a given variable.

Van Vliet considered the surface temperature variable in his analysis. The Monte Carlo method was employed to simulate missing data situations, random and nonrandom. The regression and autocorrelation coefficients were computed for each time series analysis.

A determination of the sensitivity of coefficient variability to random and nonrandom missing data was made for different series lengths. The conclusion was that if the missing data are random, the effect is that of using a smaller sample, and the change in the variability of the regression coefficients is predictable from the amount of reduction in sample size. The random deletion of data increases both regression and autocorrelation coefficient variability.

For nonrandom missing data, or an excessive number of longer sequences of missing data, the increase in the variance of the regression coefficients is roughly twice the increase for random missing data. For nonrandom missing data the increase in variance of the autocorrelation coefficients is roughly 1.2 times the increase attributable to random missing data. The above suggests that the autocorrelation coefficients are less sensitive to the effects of nonrandom missing data than the regression coefficients.


E. R. Anderson, using regression and autocorrelation techniques, determined that in order to eliminate short term variability and reliably estimate sea-surface temperature, a time series record of 8 to 10 years is needed. Anderson developed a regression model considering latitude, longitude, and day of year as independent variables. This model was found capable of estimating seasonal variation of sea-surface temperature off the west coast of the United States, in water depths of greater than 100 fathoms, to a standard deviation of less than 1°F. Anderson's model has the form $T_s = F(\text{latitude}, \text{longitude}, \text{day-of-year})$:


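$$
\begin{aligned}
T_s = {} & \beta_0 + \beta_1 D_y + \beta_2 D_y^2 + \beta_3 D_y^3 + \beta_4 D_y^4 + \beta_5 D_y^5 && \text{(day-of-year)} \\
& + \beta_6 L_a + \beta_7 L_a^2 + \beta_8 L_a^3 && \text{(latitude)} \\
& + \beta_9 L_o + \beta_{10} L_o^2 + \beta_{11} L_o^3 && \text{(longitude)} \\
& + \beta_{12} L_a D_y + \beta_{13} L_a D_y^{?} + \beta_{14} L_a D_y^{?} && \text{(latitude-day)} \\
& + \beta_{15} L_o D_y + \beta_{16} L_o D_y^{?} + \beta_{17} L_o D_y^{?} && \text{(longitude-day)} \\
& + \beta_{18} L_a L_o + \beta_{19} L_a^{?} L_o^{?} + \cdots + \beta_{22} L_a^{?} L_o^{?} && \text{(latitude-longitude)}
\end{aligned}
$$

where (exponents marked ? are not legible in the source)

$L_a$ = latitude
$L_o$ = longitude
$D_y$ = day-of-year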

It should be pointed out that the present study is primarily a search for adequate models to represent temperature, salinity, and sound velocity, and the data used are primarily from one area and one season. The seasonal variation, therefore, will not be as pronounced as in Anderson's study. The terms of Anderson's model, however, are incorporated into one of the more complex temperature models to be discussed in Chapter III. This model is also expanded to include depth as an independent variable. It is hoped that this technique will help in explaining variability of temperature to depths of 500 meters.
III. DISCUSSION


The extremely dynamic character of the ocean environment is a formidable obstacle in the search for stable techniques for predicting ocean variables. The oceanographic problem, then, becomes one of searching for "adequate" models to use in reduction of available data.

This chapter presents the results of a preliminary inquiry into the feasibility of eliminating the need for "on-location" measurements of temperature and salinity by building multiple regression models to predict these variables as functions of geographic location, depth, and time (day-of-year).

The regression development consists of a systematic consideration of polynomial models of the form

$$Y = \sum_{i=1}^{n} \beta_i X_i + \epsilon$$

where

$$X_i = f(Z_1^{a_1}, Z_2^{a_2}, \ldots, Z_n^{a_n})$$

such that the $Z_j$ are independent variables, the $a_j$ are powers of the independent variables, and $\epsilon$ is the error.
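To make the form of the candidate terms concrete, the sketch below enumerates every cross product of a set of independent variables up to a chosen total order and evaluates one such term for a single observation. It is only an illustration: the function names are hypothetical, and the study's own models were assembled by trial and error rather than by exhaustive enumeration.

```python
from itertools import combinations_with_replacement


def polynomial_terms(variable_names, max_order):
    """Exponent tuples for all cross products of total order 1..max_order.

    Each tuple holds one exponent per variable, in the order given.
    """
    terms = []
    n = len(variable_names)
    for order in range(1, max_order + 1):
        for combo in combinations_with_replacement(range(n), order):
            exponents = [0] * n
            for idx in combo:
                exponents[idx] += 1
            terms.append(tuple(exponents))
    return terms


def evaluate_term(exponents, values):
    """Compute one X_i as the product of each variable raised to its power."""
    x = 1.0
    for a, z in zip(exponents, values):
        x *= z ** a
    return x


# Second order model in two variables (latitude, longitude).
terms = polynomial_terms(["latitude", "longitude"], 2)

# Cross product latitude * longitude at 38 N, 70 W (west is negative).
cross = evaluate_term((1, 1), (38.0, -70.0))
```

For two variables and second order this yields five candidate terms (two linear, two squared, one cross product); the count grows quickly with order and number of variables, which is why the stepwise procedure is used to discard terms.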

The models tried vary in complexity, from second order models with only two independent variables (latitude and longitude) to tenth order models with six independent variables (latitude, longitude, depth, temperature, salinity, and day of year). The investigation proceeded from producing models for individual depth planes to a general regression situation for which depth was an independent variable. The final, more involved model for temperature contains the terms of the model described by E. R. Anderson to account for seasonal variation in temperature.

Mackenzie's equation proves to be a valuable tool for comparing regression results with some existing standard; however, Figure 1 reveals a deficiency in prediction, particularly at depths to 1500 meters. The nearly linear decrease in the magnitude of the residuals, $r_i = w_i - m_i$, where $w_i$ is Wilson's value and $m_i$ is Mackenzie's value at observation i, suggests that a slight modification in the depth dependency term is in order. The reference velocity is taken as that of Del Grosso, $V_{0,35,0} = 1448.5$, and an amount 0.0012D is subtracted from the depth dependency term. That is, now (the exponent of the $D^2$ coefficient is not legible in the source):

$$\Delta V_D = 1.815 \times 10^{-2}\,D - 5.291 \times 10^{-?}\,D^2 - 1.2 \times 10^{-3}\,D = 1.695 \times 10^{-2}\,D - 5.291 \times 10^{-?}\,D^2$$

Notice that at upper depths the change in the depth dependency term is negligible, but since the reference velocity is 3 m/s greater, Mackenzie's equation predicts very close to Wilson's. As depth increases, the depth dependency change becomes more pronounced, until at 2500 meters the effect of the higher reference velocity is cancelled (i.e., 0.0012(2500) = 3), and Mackenzie's equation is predicting as it was originally.
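The net effect of the modification at any depth follows directly from the two changes just described, and a short calculation makes the cancellation at 2500 meters explicit. The function name below is hypothetical; the constants are those quoted in the text.

```python
def modification_offset(depth_m):
    """Modified minus original Mackenzie velocity (m/s) at a given depth.

    +3.0 m/s comes from raising the reference velocity from 1445.5 to 1448.5;
    -0.0012 * D comes from the amount subtracted from the depth term.
    """
    return (1448.5 - 1445.5) - 0.0012 * depth_m


offset_surface = modification_offset(0.0)    # full 3 m/s correction
offset_500 = modification_offset(500.0)      # about 2.4 m/s
offset_2500 = modification_offset(2500.0)    # correction cancelled
```

The offset falls linearly with depth, mirroring the nearly linear decrease of the residuals in Figure 1.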

Figure 2 shows a plot of the residuals after modification of Mackenzie's equation, and Figure 3 shows the distribution of residuals by means of a histogram, after this modification.

The magnitude of 3718 of the 3720 residuals obtained was of the order $|r_i| < 0.9$. The two residuals whose magnitude was greater than 1.0 were found at a zero salinity reading. Clearly, for this area, Mackenzie's equation is much improved and will be very beneficial for comparison with regression sound velocity predictions.


[Figure 2. Axes: residual (m/s) against depth (x 100 meters).]

Figure 2 represents a plot of the average residual ($\bar{r}_j = \bar{w}_j - \bar{m}_j$) at depth $D_j$. The plot does not show the residuals which reached larger values (e.g., > 0.5). For this reason the distribution is shown in Figure 3 as a histogram, plotted as number of residuals against magnitude of residual. For example, the number of residuals from 0.0 to 0.1 is 428. Alternating positive and negative residuals lower the value of the average $\bar{r}_j$ at $D_j$ in Figure 2.
After modification, Mackenzie's equation was checked on a 1° by 1° square of data from 38°-39° N latitude and 69°-70° W longitude. This run substantiated the validity of the modification, for all the residuals ($w_i - m_i$) here fell in the range of -0.5 m/s to 0.9 m/s.
Figure 3. Histogram of residual distribution of $r_i = w_i - m_i$ for the same data in the 4° x 4° square 36°-40° N latitude, 68°-72° W longitude.


Throughout the remainder of this discussion, $w_i$ and $m_i$
will  represent  Wilson's  and  Mackenzie's  sound  velocity, 
respectively,  as  before,  and  will  represent  the  sound 
velocity  yielded  by  the  regression  equation. 

The stepwise multiple regression procedure was utilized in building polynomial models involving two to six independent variables and various higher order cross products of these variables, in ascending order of complexity. The greatest significance is attached to the more complex models toward the end of the study, and therefore the most comprehensive analysis is reserved for those models described on pages 31 - 34.

The set of data used is from the 4° by 4° square 36° - 40° N latitude, 68° - 72° W longitude, consisting of 3720 data points over 20 depth planes of 0, 10, 20, 30, 50, 75, 100, 150, 200, 250, 300, 400, 500, 600, 800, 1000, 1200, 1500, 2000, and 2500 meters.

Some arbitrary criterion must be established for
measuring how well the regression equations fit the data.
This may be achieved in several ways. This
investigator will use three common criteria for determining
goodness of fit. First, and probably most important, is the
$R^2$ ratio, or percent of variation explained by the regression
equation; second, the standard error of the regression
equation; and third, plots of the residuals (deviation from
actual value) against the dependent variable ($y$). Ideally,
we wish to increase $R^2$ as we decrease the standard error.
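The first two criteria can be illustrated with a short calculation. The sketch below is not the program used in the study; it assumes the usual definitions, $R^2 = 1 - SSE/SST$ and a standard error equal to the square root of the residual mean square, and the function and argument names are hypothetical.

```python
import math


def goodness_of_fit(observed, predicted, n_terms):
    """Return (r_squared, standard_error) for a fitted regression equation.

    n_terms is the number of fitted coefficients, constant included
    (an assumption about how the degrees of freedom are counted).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual sum of squares: variation the equation leaves unexplained.
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Total sum of squares about the mean of the observations.
    sst = sum((y - mean_obs) ** 2 for y in observed)
    r_squared = 1.0 - sse / sst
    standard_error = math.sqrt(sse / (n - n_terms))
    return r_squared, standard_error
```

A larger $R^2$ together with a smaller standard error indicates the better of two candidate equations fitted to the same data.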

The stepwise procedure requires a significance level
for the deletion of non-significant terms from the model and
the addition of significant terms. In most of the ensuing
models, an F level of 2.65 is used for adding and deleting
variables in the model building process. This figure represents
$F(1, \nu_2, 0.90)$ where $\nu_2 > 120$ degrees of freedom.
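The entry and deletion rule can be sketched as a partial F test on a single candidate term. The text does not give the exact computation performed by the stepwise program, so the following is a minimal illustration under the standard definition (reduction in residual sum of squares divided by the residual mean square of the larger model); all names are hypothetical, and only the threshold 2.65 comes from the text.

```python
F_THRESHOLD = 2.65  # F level quoted in the text for adding and deleting terms


def partial_f(sse_without, sse_with, n_obs, n_terms_with):
    """Partial F statistic (1 numerator degree of freedom) for one term.

    sse_without  residual sum of squares of the model lacking the term
    sse_with     residual sum of squares of the model containing the term
    n_terms_with number of coefficients, constant included, in the larger model
    """
    residual_mean_square = sse_with / (n_obs - n_terms_with)
    return (sse_without - sse_with) / residual_mean_square


def keep_term(sse_without, sse_with, n_obs, n_terms_with,
              threshold=F_THRESHOLD):
    """True when the term is significant enough to enter or stay."""
    return partial_f(sse_without, sse_with, n_obs, n_terms_with) >= threshold
```

A stepwise program would apply `keep_term` to each candidate at every step, entering the most significant new term and deleting any term whose statistic has dropped below the threshold.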

When plotting the residuals ($y_i - \hat{y}_i$) against $\hat{y}$, four
common patterns may appear, signifying certain conditions of
the prediction equation over the range of the dependent
variable. Figure 4 shows the general shape of these patterns.
Variations of shape, slope, and combinations of more
than one are possible.

[Figure 4: four panels, A through D]

Figure 4. Possible patterns of residual plot of $y_i - \hat{y}_i$
against $\hat{y}_i$.

Interpretation of the cases is as follows:

A. Residuals falling in a horizontal band indicate no
unaccounted-for effects over the range of the
dependent variable. This indicates a normal
regression situation and good fit.

B. Residual plot forms a fan pattern, indicating the
variance is not constant but increases with increasing
values of the dependent variable. This implies
weighted least squares analysis should be used
instead.

C. Band with slope greater than zero, indicating that a
linear term is needed in the model.

D. Nonlinear band, indicating that linear and quadratic terms
are needed in the model.

This type of analysis will be applied to the more significant
models.
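Two of the patterns described above, the sloped band and the fan, can also be checked numerically. The sketch below is an assumed illustration, not part of the original analysis: it reports the least squares slope of the residuals against the predictions and the ratio of the mean absolute residual in the upper half of the predictions to that in the lower half. The slope is mainly informative when the residuals do not come from the same least squares fit as the predictions, since an ordinary fit with a constant term forces that slope to zero. The code assumes at least two observations, distinct predictions, and a non-zero spread in the lower half.

```python
def residual_diagnostics(predicted, residuals):
    """Return (slope, spread_ratio) for residuals plotted against predictions."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(residuals) / n
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    sxy = sum((p - mean_p) * (r - mean_r)
              for p, r in zip(predicted, residuals))
    slope = sxy / sxx  # near zero for a horizontal band (case A)

    # Compare the scatter of the residuals at low and high predictions.
    pairs = sorted(zip(predicted, residuals))
    half = n // 2
    low = [abs(r) for _, r in pairs[:half]]
    high = [abs(r) for _, r in pairs[half:]]
    spread_ratio = (sum(high) / len(high)) / (sum(low) / len(low))
    return slope, spread_ratio
```

A slope near zero with a spread ratio near one is consistent with case A; a spread ratio well above one suggests the fan of case B.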
Second Order Model - Two Independent Variables

The second order model was the simplest of all models
tried. The purpose was to determine if temperature, salinity,
and sound velocity are functions of geographic location
(latitude and longitude). Temperature, salinity, and sound
velocity are used as dependent variables. Depth is not
an independent variable here; consequently, the model is
applied to the data at each depth plane for each dependent
variable.

Model 1 used here is as follows:

\[
\beta_0 + \beta_1 Z_1 + \beta_2 Z_1^2 + \beta_3 Z_2 + \beta_4 Z_1 Z_2 + \beta_5 Z_1^2 Z_2 + \beta_6 Z_2^2 + \beta_7 Z_1 Z_2^2 + \beta_8 Z_1^2 Z_2^2
\]

where $Z_1$ = latitude and $Z_2$ = longitude.
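To make the structure of Model 1 concrete, the sketch below builds its nine terms for a single observation and evaluates the polynomial for a given set of coefficients. The term order (constant, $Z_1$, $Z_1^2$, $Z_2$, $Z_1 Z_2$, $Z_1^2 Z_2$, $Z_2^2$, $Z_1 Z_2^2$, $Z_1^2 Z_2^2$) is read from the model as printed; the function names are hypothetical and no fitted coefficients from the study are implied.

```python
def model1_terms(latitude, longitude):
    """Return the nine Model 1 terms for one observation."""
    z1, z2 = latitude, longitude
    return [
        1.0,                # constant
        z1,
        z1 ** 2,
        z2,
        z1 * z2,
        z1 ** 2 * z2,
        z2 ** 2,
        z1 * z2 ** 2,
        z1 ** 2 * z2 ** 2,
    ]


def evaluate(coefficients, terms):
    """Evaluate the polynomial: sum of coefficient times term."""
    return sum(b * t for b, t in zip(coefficients, terms))
```

One such row of terms per observation forms the design matrix from which the stepwise procedure selects the significant terms at each depth plane.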

Figure 5 shows the $R^2$ and corresponding standard error $\sigma_{\hat{T}}$
of $\hat{T}$ for each depth plane when temperature is the dependent
variable. Figures 6 and 7 show plots of the $R^2$ statistic
and corresponding standard error, for each depth plane,
where salinity and sound velocity are the dependent variables,
respectively.

An examination of the residuals from the resulting equations
and the plots in figures 5, 6, and 7 reveals deficiencies
in Model 1. The residuals (actual - predicted) are
generally in the ranges of ±5°C for temperature, ±8‰ for
salinity, and ±50 meters per second for sound velocity. These
residuals are too large in comparison to the magnitude of the
numbers being predicted and indicate an obvious need for more
independent variables and/or higher order cross products in
the model.


Figure 5. $R^2$ versus $\sigma_{\hat{T}}$ for Model 1.

Figure 6. $R^2$ versus $\sigma_{\hat{S}}$ for Model 1 (bad data screened).
A further check was performed on the results of Model 1
by evaluating Mackenzie's equation using the temperature and
salinity yielded by the regression equations rather than the
true temperature and salinity. The sound velocity obtained
by Mackenzie's equation in this manner was then compared to
Wilson's sound velocity value and then to the sound velocity
predicted by the regression sound velocity equation for the
same data. Mackenzie's equation utilizing the calculated
temperature and salinity displayed severe differences from
Wilson values. In many cases the differences were 800
m/sec! The differences between Mackenzie's prediction and
the regression equation prediction were even more severe.
Some of the differences here reached 900 m/sec. The inadequacy
of Model 1 was substantiated, and a more expanded
model was tried.

Much of the inadequacy of Model 1 and the rather wild
results obtained in the analysis is attributable to missing
data resulting from such things as instrument failure or bad
weather. To eliminate as much of the effect of missing
data as possible, a screening is implemented so that if a
zero temperature or salinity reading is encountered, the
observation is excluded from the analysis. Figure 6 shows
the effect of screening out bad data.
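The screening rule amounts to a simple filter on the observations. The sketch below assumes each observation is held as a dictionary with hypothetical keys `temperature` and `salinity`; the original card layout and program are not described in the text.

```python
def screen_missing(records):
    """Drop observations whose temperature or salinity is recorded as zero.

    A zero reading is treated as a missing value, following the
    screening rule described in the text.
    """
    return [
        r for r in records
        if r["temperature"] != 0 and r["salinity"] != 0
    ]
```

Only the observations that pass the filter would then be supplied to the regression.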

Second  Order  Model  -  Four  Independent  Variables 

In this facet of the study the regression model was
expanded to include four independent variables: latitude,
longitude, day of year, and time of day.

The introduction of additional independent variables
greatly increases the possible combinations of cross products
which could be considered to enter the model. A
judicious choice was made and the resulting model was:

\[
\begin{aligned}
&\beta_0 + \beta_1 Z_1 + \beta_2 Z_1^2 + \beta_3 Z_1 Z_2 + \beta_4 Z_1 Z_3 + \beta_5 Z_1 Z_4 + \beta_6 Z_2 + \beta_7 Z_2^2 \\
&+ \beta_8 Z_2 Z_3 + \beta_9 Z_2 Z_4 + \beta_{10} Z_3 + \beta_{11} Z_3^2 + \beta_{12} Z_3 Z_4 + \beta_{13} Z_4 + \beta_{15} Z_4^2 \\
&+ \beta_{16} Z_1 Z_2 Z_4 + \beta_{17} Z_1 Z_3 Z_4 + \beta_{18} Z_2 Z_3 Z_4 + \beta_{19} Z_1 Z_2 Z_3 Z_4 + \varepsilon \qquad \text{(Model 2)}
\end{aligned}
\]

where $Z_1$ = day-of-year, $Z_2$ = time-of-day, $Z_3$ = latitude,
and $Z_4$ = longitude.

Higher order terms were arbitrarily avoided at this
point to minimize the complexity of the problem in the early
stages. Notice, however, that $Z_1^2$, $Z_2^2$, $Z_3^2$, and $Z_4^2$ have
been included.

Seven depth planes were chosen for the analysis: 0, 10,
20, 50, 100, 500, and 1500 meters. Model 2 was applied to the
data at each depth plane for each of the dependent variables
temperature, salinity, and sound velocity. $\hat{T}$, $\hat{S}$, and
$\widehat{SV}$ were determined at each depth plane with the 90% F of 2.65.

Table I shows the $R^2$ statistic and corresponding standard
error for each regression equation at each depth plane
considered.

Depth in      Temperature          Salinity          Sound velocity
meters       R^2   Std. error    R^2   Std. error    R^2   Std. error

    0       .666      3.07      .378     13.09      .596      53.1
   10       .721      2.92      .346     13.75      .512      62.9
   20       .813      2.56      .736       .61      .627      53.9
   50       .806      3.15      .705       .62      .640      64.5
  100       .774      2.57      .751       .37      .627      60.4
  500       .812      2.98      .813       .35      .756      50.5
 1500       .552       .39      .185      .073      .265      31.4

Table I. The $R^2$ and corresponding standard error for
all equations developed using Model 2 at
depth planes indicated.

The residuals associated with the regression equations
at the various depth planes still showed excessively large
deviations from the observed values. Residual patterns were
similar to those of Model 1. The residuals for the equations
derived from Model 2 still were generally in the range
of ±5°C for temperature, ±8‰ for salinity, and ±50 m/sec
for sound velocity. Calculation of Mackenzie's equation
using the calculated temperature and salinity and comparing
to Wilson's and the regression sound velocity for the same
data showed no significant improvement over results from
Model 1. Further comment on this particular model is
deferred until more comprehensive models have been discussed.

In the general regression situation it was desired to
create a model which involves as many significant independent
variables and cross products as possible while at the
same time containing as few terms as possible to do a
responsible job of predicting.

Depth  should  have  a  significant  relationship  to 
salinity  and  sound  velocity.  This  introduces  the  general 
problem  of  developing  regression  equations  for  temperature, 
salinity,  and  sound  velocity  over  all  depth  planes. 

A reassessment of the basic problem reveals two unanswered
questions. First, is it possible to develop
regression equations to adequately predict temperature and
salinity values, which could then be used in an existing
equation, such as Wilson's or Mackenzie's equation, to yield
a sound velocity value near the true value without the need
for "on-location" measurements of temperature and salinity?
Second, if adequate temperature and salinity equations can
be developed, could a regression equation for sound velocity
then be used, utilizing these predicted values, to produce
sound velocities close to the true reading without relying
on existing methods such as Mackenzie's or Wilson's equation?
The most important aspect in either case is eliminating
the need for actual measurement by instruments.

There  are  at  least  two  procedures  which  may  be  used  in 
developing  the  desired  regression  equations  for  temperature, 
salinity  and  sound  velocity.  First,  one  large  model  may  be 
used,  changing  only  the  dependent  variable.  Second,  an 
individual  model  for  each  dependent  variable  may  be  used. 

It was concluded, after extensive model testing too
voluminous to present here, that the individual character
of the dependent variables temperature, salinity, and sound
velocity requires individual models.

Thus the models presented in the ensuing discussion
were built according to procedure two, and yield better
results than those models tested by procedure one. It
should also be pointed out that the temperature, salinity,
and sound velocity models presented in the following discussion
are the culmination of an extensive trial and error
model building process. These are the models which produced
the most significant results.

S  =  Salinity  =  F  (latitude,  longitude,  depth) 

=  3  +3  Z  +  e  2^+3  Z  Z  +3  Z  Z  +3  Z  Z  +3  Z 

0  1  1  2.  X  311  312  413  52 

+3  Z^+3  Z  Z  +3  Z  +3  Z^+3  Z®+3  Z^+3  Z® 

62  723  83  93  101  112  123 

+  3  Z‘^  +  3  Z''  +  3  Z‘'  +  3  Z  =  +  3  Z®  +  3  Z®+3  Z® 

1  3  1  142153  161  172  183  19  1  \IIlOCl6X  o) 

+  3  Z‘*Z''  +  3  Z'*Z‘*  +  3  Z®  +  3  Z‘*Z'*+3  Z®  +  3  Z®Z^ 

20  1  2  21  1  3  22  2  23  2  3  24  3  25  1  3 

+  3  Z®zH3  Z®+3  Z®Z^  +  3  Z®  +  3  Z^®  +  3  Z®Z® 

26  2  3  27  1  28  1  2  292  301  31  1  2 

+3  Z®Z®+3  Z^°+3  Z®Z®+3  Z^°+3  /Z  +3  /Z^ 

32  1  3  33  2  34  2  3  353  36  3  37  3 

+3  /Z®+e 

3  8  3 

where $Z_1$ = latitude, $Z_2$ = longitude, $Z_3$ = depth. Application
of this model to the available data yielded the following
prediction equation.

S  =  -1.096Z  +.1635xlO“‘*Z  Z  -IxlO^'Z^t .  253xlO“®Z^ 

1  13  3  3 

+  .  959xlO“®Z^+.614xlO"^  °Z?Z|-. 848x10“^ ^Z^Zf 
+  .  976x10“ ®Z^Z^+. 62x10"^  ® Z ®- . 196x10“^ ^Z ®Z ® 

12  2  13 

+.733x10“2®Z®Z®+.8x10“®®Z^°-9.6133/Z  +8 . 6/Z ® +68 . 57 

2  3  3  3  3 

The prediction equation represents a relatively good
statistical fit to the salinity observations on cards.
Nearly all residuals ($S_i - \hat{S}_i$) fall in the range -1.5 to 1.5,
and $R^2$ = 64.14 percent, with the standard error of $\hat{S}$ = .7375.

The temperature model is more complex since it involves
two more independent variables than the salinity
model, and incorporates the terms of Anderson's model to
account for seasonal variation. The model is expanded to
include depth and day of year as independent variables.

T  =  Temperature  =  F  (latitude,  longitude,  depth, 

salinity,  day  of  year) 

=  3  +3  Z  +3  Z2+3  Z  Z  +3  Z'Z  +3  Z  Z  +3  Z  Z 

0  II  21  312  413  514  615 

+3  z  +3  Z2+3  Z  Z  +3  Z  Z  +3  Z  Z  +3  Z 

72  82  923  1024  1125  123 

+  3  ZH3  Z  Z  +3  z  Z  +3  z  +3  Z2  +  3  Z  Z  (Model  4) 

13  3  14  3  4  1  5  3  5  1  6  4  1  7  4  1  8  4  5 

+  3  Z  +3  Z'‘  +  3  Z®  +  3  Z®  +  3  Z®+3  Z®  +  3  Z® 

19  5  20  5  21  1  22  2  23  3  24  4  25  5 

+  3  e^H3  Z®Z  +3  Z‘'  +  3  Z''  +  3  z‘'  +  3  Z  Z® 

26  2712  283  294  305  3115 

+3  z  Z®+3  z  Z®+3  Z®+3  Z®+3  Z  Z®+3  Z  Z® 

3225  3312  344  355  3615  3725 

+  3  ln(Z  )  +3  e^^  +  3  Z  +  ^  /Z^  +  3  Z^Z 

38  3  39  4012  41  4  4212 

+3  Z  /Z  +e 

4  3  4  3 

where $Z_1$ = latitude
      $Z_2$ = longitude
      $Z_3$ = depth
      $Z_4$ = salinity
      $Z_5$ = day-of-year
      $\varepsilon$ = error term

Notice that some experimental cross products are
included in the model. It is interesting to note that some
of these odd terms entered the resulting regression equation
at high levels of significance. Applying this model to the
data resulted in the following prediction equation for
temperature.
T  =  -34.96  +  .19xlO"^Z  Z  +  .246xlO“^Z  Z  -.018Z  Z 

13  15  2  4 

-.leZxlO'^Z^  +  .477xlO"^Z  Z  - .  3599xlO"‘*Z  Z 

3  3  4  3  5 

-.884xlO”^Z  Z  +  .492x10"2z^  -.187x10“®Z® 

4  5  1  2 

+  .78xlO"®Z®  +  .1466x10"^  “e^'*  +  .  746xlO"‘*Z®Z 

3  12 

-.13xl0"“z‘*  -.513xlO"^Z  Z^  -.584x10"^2Z  Z® 

3  2  5  1  5 

+  5.62  ln(Z  )  -.423Z  /Z 

3  if  3 

This equation represents a good fit to the 3720 temperature
observations on cards. For this set of data, $R^2$ = .9484
and the standard error of $\hat{T}$ = 2.32. The vast majority of
residuals ($T_i - \hat{T}_i$) fall in the range ±2°C from the observed
value.

Finally, the sound velocity model used to fit the 3720
sound velocity observations is a function of five independent
variables.

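SV = sound velocity = F(latitude, longitude, depth,
temperature, salinity)

\[
\begin{aligned}
SV ={}& \beta_0 + \beta_1 Z_1 + \beta_2 Z_1^2 + \beta_3 Z_1 Z_2 + \beta_4 Z_1 Z_3 + \beta_5 Z_1 Z_4 + \beta_6 Z_1 Z_5 \\
&+ \beta_7 Z_2 + \beta_8 Z_2^2 + \beta_9 Z_2 Z_3 + \beta_{10} Z_2 Z_4 + \beta_{11} Z_2 Z_5 + \beta_{12} Z_3 \\
&+ \beta_{13} Z_3^2 + \beta_{14} Z_3 Z_4 + \beta_{15} Z_3 Z_5 + \beta_{16} Z_4 + \beta_{17} Z_4^2 + \beta_{18} Z_4 Z_5 \\
&+ \beta_{19} Z_5 + \beta_{20} Z_5^2 + \beta_{21} Z_1 Z_2 Z_3 + \beta_{22} Z_1 Z_2 Z_4 + \beta_{23} Z_1 Z_2 Z_5 \\
&+ \beta_{24} Z_1 Z_3 Z_4 + \beta_{25} Z_1 Z_3 Z_5 + \beta_{26} Z_1 Z_4 Z_5 + \beta_{27} Z_2 Z_3 Z_4 \\
&+ \beta_{28} Z_2 Z_3 Z_5 + \beta_{29} Z_2 Z_4 Z_5 + \beta_{30} Z_3 Z_4 Z_5 + \beta_{31} Z_1 Z_2^2 Z_3 \\
&+ \beta_{32} Z_1 Z_2 Z_4 Z_5 + \beta_{33} Z_1 Z_3 Z_4 Z_5 + \beta_{34} Z_2 Z_3 Z_4 Z_5 \\
&+ \beta_{35} Z_1 Z_3 Z_4 Z_5^2 + \varepsilon \qquad \text{(Model 5)}
\end{aligned}
\]

where $Z_1$ = latitude
      $Z_2$ = longitude
      $Z_3$ = depth
      $Z_4$ = temperature
      $Z_5$ = salinity

and temperature will be a function of day-of-year.

Applying this sound velocity model to the available
data yielded the following prediction equation: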

SV  =  (4894.08  +  .0222Z2  +  .112Z  Z  -.1127Z  +.373xlO"’'Z^ 

112  3  3 

+  .65xlO"^Z  Z  -.103Z^  +  3.58Z  +  . 015Z^- . 0052Z  Z  Z 

35  t  5  5  121* 

+  .59xlO"'»Z  Z  Z  +.685xlO"®Z  Z  Z  +.799xlO"®Z  Z  Z  Z 

134  123  1245 

-.226xlO"^Z  Z  Z  Z^)/3.281 

13  4  5 

This sound velocity equation is a very good fit to the
data, with $R^2$ = .9935 and 98% of the residuals ($SV_i - \widehat{SV}_i$)
falling in the range of ±2 m/sec. The standard error of
$\widehat{SV}$ = 2.9 m/sec.

The method by which these equations were derived
presents an interesting possibility. A sound velocity value
could be computed knowing only latitude, longitude, depth,
and day-of-year, since

    Salinity = F(lat, lon, depth)
    Temperature = F(lat, lon, depth, salinity, day-of-year)
    Sound velocity = F(lat, lon, depth, temperature, salinity)

with each predicted value feeding the equation below it.
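The chaining of the three equations can be sketched as follows. The three equation arguments are placeholders for the fitted salinity, temperature, and sound velocity equations; their coefficients are not reproduced here, and the function and parameter names are hypothetical.

```python
def predict_sound_velocity(lat, lon, depth, day_of_year,
                           salinity_eq, temperature_eq, sound_velocity_eq):
    """Chain the three regression equations.

    salinity_eq(lat, lon, depth)
    temperature_eq(lat, lon, depth, salinity, day_of_year)
    sound_velocity_eq(lat, lon, depth, temperature, salinity)
    """
    # Step 1: salinity from position and depth alone.
    salinity = salinity_eq(lat, lon, depth)
    # Step 2: temperature from position, depth, predicted salinity, and date.
    temperature = temperature_eq(lat, lon, depth, salinity, day_of_year)
    # Step 3: sound velocity from position, depth, and the two predictions.
    return sound_velocity_eq(lat, lon, depth, temperature, salinity)
```

No instrument reading enters the chain: the only inputs are the position, the depth, and the day of year chosen by the user.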

There are now five sound velocity values for each
latitude, longitude, and depth.

1. Wilson's value (given in initial data).

2. Mackenzie's value computed using the observed
temperature and salinity.

3. The regression equation value ($\widehat{SV}$) using the
observed temperature and salinity.

4. Mackenzie's value computed using the predicted
temperature and salinity.

5. The regression equation value using the predicted
temperature and salinity.

For each of the 3720 latitude, longitude, and depth
observations, these five sound velocity values were obtained.
With these five sound velocities, six comparisons were made
for each data point.

Using the observed temperature and salinity:

1. $r_{wm} = w_i - m_i$ (Wilson's - Mackenzie's)

2. $r_{wB} = w_i - B_i$ (Wilson's - Regression S.V.)

3. $r_{mB} = m_i - B_i$ (Mackenzie's - Regression S.V.)

Using the predicted temperature and salinity:

4-6. The same three comparisons, in the same order.

Six corresponding residual distributions were developed
according to the magnitude of the residual. The purpose of
the distributions is to determine how many of the residuals
are more than 30 m/sec high, 29-30 m/sec high, . . ., 29-30
m/sec low, more than 30 m/sec low. Table II shows the six
residual distributions and their densities.
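The counting scheme for a residual distribution can be sketched as below: each residual is assigned to a 1 m/sec interval, and everything beyond the limit is gathered into one overflow class on each side. The treatment of values that fall exactly on an interval boundary is an assumption, since the text does not state it, and the function name is hypothetical.

```python
import math


def residual_distribution(residuals, limit=30):
    """Count residuals in 1 m/sec intervals.

    Key k (an integer) stands for the interval [k, k + 1).
    Key `limit` collects residuals of `limit` or more (high side);
    key `-limit - 1` collects residuals below `-limit` (low side).
    """
    counts = {}
    for r in residuals:
        k = math.floor(r)
        k = max(-limit - 1, min(limit, k))  # clamp into the overflow classes
        counts[k] = counts.get(k, 0) + 1
    return counts
```

For example, a residual of 0.9 m/sec raises the count of the 0-1 m/sec interval by one. Dividing each count by the number of observations gives the density of the distribution.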

Using the observed temperature (T) and salinity (S),
Wilson and Mackenzie show hardly any difference, as would be
expected after modification of Mackenzie's equation.

Using the observed (actual) T and S, the residual distribution
for $w_i - B_i$ shows 98% of the residuals are in the
range ±2 m/sec. This indicates a good fit to the Wilson
values.

The third distribution, $m_i - B_i$, using the observed
T and S, indicates that the regression equation is a close
duplication of Mackenzie's equation; that is, only 81 of
3720 predictions differ by more than ±2 m/sec. This is an
interesting point, for the regression equation for sound
velocity is much simpler in form than Mackenzie's equation.

The fourth distribution is obtained by comparing the
Wilson sound velocity values with the Mackenzie values
computed from a predicted T and S. The resulting residual
distribution takes on the shape of a normal distribution
which is slightly skewed to the left. Figure 8 shows the
distribution by means of a histogram of magnitude against
number. It is felt that the resulting distribution enhances
the feasibility of predicting sound velocity given only latitude,
longitude, and depth, and being at least 70% sure of
being within 9 meters/sec of the true sound velocity.

The fifth residual distribution of Table II is obtained
by evaluating the regression sound velocity equation using
the predicted temperature and salinity and comparing the
results with Wilson's value from the card (i.e., obtain all
$w_i - B_i$). The residual distribution here is almost identical
with distribution 4. The histogram of figure 8 adequately
represents distribution 5 as well as distribution 4.

Distribution 6 compares the sound velocity predictions
of Mackenzie's sound velocity equation to those of the
regression sound velocity equation using predicted temperature
and salinity values. If distribution 3 is compared with
distribution 6 in Table II, it is clear that the regression
sound velocity equation predicts very nearly the same as
Mackenzie's sound velocity equation.

Table II. Residual distribution densities for sound
velocity equations.

The  nearly  normal  residual  distribution  obtained  by 
using  the  modified  Mackenzie  equation  with  the  predicted 
temperature  and  salinity  and  the  results  obtained  when  these 
sound  velocity  values  are  compared  to  Wilson's  values  for 
the  same  data,  underscores  the  random  error  in  the  data  from 
which  the  temperature,  salinity  and  sound  velocity  equations 
were  developed. 

Final analysis involved computing a predicted temperature
and salinity from their respective regression equations
for use in the regression sound velocity equation.
The predicted sound velocity from the regression equation
($\widehat{SV}$) was compared to Wilson's value for the same data at
each observation, forming 3720 residuals (Wilson's sound
velocity - regression sound velocity). A plot of these
residuals against $\widehat{SV}$ for each respective observation revealed
a pattern as shown in figure 9. That is, the regression
sound velocity equation shows no unaccounted-for effect over
the range of the dependent variable and indicates a reasonably
good fit, as previously noted in the explanation of
figure 4. It was observed that 88% of the residuals fell
within this horizontal band from +12 m/s to -12 m/s (i.e.,
no indicated lack of linear or quadratic terms in the sound
velocity model). This residual pattern is what would be
expected if the error is random. The analysis presented
concerning models 3, 4, and 5 indicates that the error in
predictions is random, though large. The prediction of
sound velocity without costly instrument measurements of
temperature and salinity may require that wider tolerances
for error be considered acceptable. For example, based on
time and cost saved on instrumental measurements of temperature
and salinity, a 90% certainty of being within 5 m/s
of the true sound velocity value might be considered adequate.

Figure 9. Residual pattern: plot of residuals against $\widehat{SV}$.
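Statements such as "88% of the residuals fell within ±12 m/s" or a target of "90% within 5 m/s" reduce to one calculation on the list of residuals. The sketch below is an assumed illustration with a hypothetical function name; it counts a residual that lies exactly on the tolerance as inside the band.

```python
def fraction_within(residuals, tolerance):
    """Fraction of residuals whose magnitude does not exceed the tolerance."""
    inside = sum(1 for r in residuals if abs(r) <= tolerance)
    return inside / len(residuals)
```

Evaluating this fraction over a range of tolerances shows directly which tolerance the prediction equations can meet at a chosen level of certainty.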

It is felt that the results of this study are significant
enough to warrant application of models 3, 4, and 5 to
additional oceanographic data, particularly in squares
surrounding the 4° by 4° square used in this investigation.

IV. CONCLUSIONS AND SUGGESTIONS FOR FURTHER WORK

The  problem  of  determining  adequate  models  for  predicting 
temperature,  salinity,  and  sound  velocity  has  been  considered. 

Sound velocity values yielded by Wilson's equation,
described on pages 9-12, are considered good enough for
use in most scientific work. The Wilson equation, however,
is rather complex and requires an excessive amount of calculation.
Mackenzie's sound velocity equation, described on
pages 6-8, is more appealing than Wilson's equation
because of its simplicity of use. The modification to the
reference velocity and depth dependency term, as described
on page 18, gives Mackenzie's equation the capability of
predicting sound velocities to within ±1 meter/second of
Wilson's equation for all data considered. Distribution 1 of
Table II on page 37 shows this result. The Mackenzie equation
was therefore concluded to be a convenient and accurate
equation from which sound velocity predictions ($m_i$) could
be obtained to compare with the regression sound velocity
predictions ($\widehat{SV}_i$). Distribution 3 in Table II is formed by
considering $m_i - \widehat{SV}_i$ for all i, when the observed salinities
and temperatures are used in each equation. In contrast,
distribution 6 uses the predicted salinities and temperatures in
each equation.

Two  approaches  to  the  problem  of  developing  prediction 
equations  were  used  in  this  investigation.  The  distinguish¬ 
ing  factor  between  the  two  approaches  is  whether  depth  is 
included  as  an  independent  variable. 
Model 1, shown on page 23, and Model 2, shown on page 28,
were the primary models considered in the first approach.
Depth is not an independent variable in Model 1 or Model 2;
therefore a prediction equation for each of the dependent
variables temperature, salinity, and sound velocity at each
depth plane results.

The results of Models 1 and 2 are discussed on pages
23 and 29, respectively. For each dependent variable, plots
of $R^2$ against depth plane, and $\sigma$ against depth plane, for
Model 1 appear on pages 24, 25, and 26. In general, all
measures of adequacy as described on page 21, and an examination
of residuals (actual - predicted) for each equation,
fail to substantiate the regression equations yielded by
models 1 and 2 as adequate for predictive purposes.

The second approach used in the study was to consider
the general situation where depth was included as one of the
independent variables. Clearly, this resulted in only one
regression equation for each dependent variable (temperature,
salinity, and sound velocity), which represents the data over
all depth planes. Data manipulation and analysis of results
are much faster if one equation can be found to represent
the data over all depth planes, rather than over only one
depth plane.

Within the second approach, there were two ways to
build the models. First, a large model of the form

\[
Y = \sum_{i=1}^{n} \beta_i X_i + \varepsilon
\]

could be designed. In using this model, only the
dependent variable would be changed. This model would
therefore be used three times. Secondly, three individual
models of form

\[
\sum_{i=1}^{n} a_i X_i + \varepsilon, \qquad
\sum_{i=1}^{n} b_i X_i + \varepsilon, \qquad
\sum_{i=1}^{n} c_i X_i + \varepsilon
\]

for temperature, salinity, and sound velocity, respectively,
could be developed.

It was concluded in an extensive trial and error model
building process, in the search for suitable regression
models, that the individual character of the dependent
variables required individual models, rather than one large
model from which all equations could be derived. The
salinity model (model 3), temperature model (model 4), and
sound velocity model (model 5) shown on pages 31, 32, and
33, respectively, are the models which gave the best results
in the analysis applied.

The salinity equation, obtained from model 3, is a
function of latitude, longitude, and depth. The temperature
equation, obtained from model 4, is a function of latitude,
longitude, depth, salinity, and day-of-year. The final
temperature model also included the terms of the model proposed
by Anderson for predicting sea surface temperature,
which also accounts for seasonal variation. The sound
velocity equation, obtained from model 5, is a function of
latitude, longitude, depth, temperature, and salinity.

When using the prediction equations to arrive at a
sound velocity, the following procedure was used.
Salinity may be calculated from values of latitude,
longitude, and depth. These are independent variables whose
values may be chosen by the user. Once the salinity value
is known, and a particular day of year is specified, then
a temperature value may be computed. Now both salinity and
temperature are defined. These are the only two values
that must be known to compute a predicted sound velocity
value from either Mackenzie's modified sound velocity equation
or the regression sound velocity equation.

For  purposes  of  comparison,  the  following  five  sound 
velocity  values  were  found  at  each  observation  of  latitude, 
longitude,  depth,  temperature,  and  salinity:  Wilson's  sound 
velocity  value,  Mackenzie's  sound  velocity  and  the  regression 
sound  velocity  using  the  observed  temperature  and  salinity, 
and  finally  Mackenzie's  sound  velocity  and  the  regression 
sound  velocity  using  the  predicted  temperature  and  salinity. 

The assumption that Wilson's sound velocity values
were the most accurate provided a standard of comparison
for the sound velocity calculations from Mackenzie's equation
and the regression equation. For example, using an observed
temperature and salinity, a sound velocity value was calculated
from Mackenzie's equation. This sound velocity
value was then subtracted from Wilson's value calculated
from the same data, and the difference ($w_i - m_i$) was observed.
This was performed at each of the 3720 data points.

Distribution 1 of Table II was formed to see how these
residuals were distributed about Wilson's predictions. If
the residual happened to be of magnitude .9 m/sec, the count
of all residuals falling in the interval 0-1 m/sec was
increased by one. Distribution 4 of Table II was formed in
the same manner using Mackenzie's equation with predicted
temperature and salinity. Similar distributions (No. 2 and
No. 5, Table II) were formed regarding the regression sound
velocity predictions for observed, as well as predicted,
temperature and salinity. Two additional distributions
(No. 3 and No. 6, Table II) compare Mackenzie's sound velocity
predictions to the regression sound velocity predictions
for observed, then predicted, temperature and salinity,
respectively. The six distributions described above are summarized
in Table II and reveal some interesting points about
the sound velocity equations and their predictive abilities.

When  using  the  observed  (instrumental)  temperature  and 
salinity  in  calculating  sound  velocity  from  a  given  equation, 
Wilson's,  Mackenzie's  and  the  regression  sound  velocity 
equations  all  predict  sound  velocity  values  very  close  to 
one  another  as  distributions  1,  2,  and  3  of  Table  II  point 
out.  The  regression  sound  velocity  equation  resulting  from 
model  5,  however,  is  simpler  in  form  and  easier  to  use  than 
Wilson's  equation  or  Mackenzie's  equation. 

The  residual  distributions  (No.  4  and  No.  5  -  Table  II), 
obtained  by  using  predicted  temperatures  and  salinities  in 
computing  sound  velocity  values  from  Mackenzie's  equation 
and  the  regression  sound  velocity  equation,  are  encouraging 
in  that  they  are  nearly  normal  about  Wilson's  sound  velocity 
predictions,  as  shown  in  figure  8.  This  form  of  residual 
distribution  underscores  the  random  error  in  the  data  from 
which  the  regression  equations  were  developed,  and  enhances 
the  feasibility  of  predicting  sound  velocity  without  the 
need  for  on-location  instrument  measurement  of  temperature 
and  salinity. 

Figure  9  shows  a  plot  of  the  residuals  (w_i  -  SV_i) 
against  the  dependent  variable  predictions  (SV_i)  for 
distribution  5,  according  to  the  analysis  described  on  pages  21 
and  22.  Figure  9  differs  from  figure  8  in  that  figure  8  is 
a  plot  of  number  of  residuals  versus  magnitude  of  residual; 
figure  9  is  a  plot  of  magnitude  of  residual  versus  magnitude 
of  the  dependent  variable  value  (SV_i).  This  plot  extends 
over  the  entire  range  of  the  dependent  variable.  The  plot 
in  figure  9  is  that  of  case  A  of  figure  4,  page  22.  The 
residual  pattern  is  roughly  a  horizontal  band,  indicating 
no  significant  unaccounted-for  effects  (linear  or  quadratic) 
in  the  model  over  the  range  of  the  dependent  variable.  Since 
the  plot  of  (w_i  -  SV_i)  versus  SV_i,  for  all  i,  is  a  horizontal 
band,  the  prediction  equation  (SV)  is  predicting  as  would  be 
expected  if  the  errors  in  the  raw  data  from  which  SV  was 
developed  were  random. 
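
A  rough  numerical  counterpart  to  the  visual  inspection  of  such  a 
plot  is  to  fit  a  straight  line  to  the  residuals  as  a  function  of 
the  predictions  and  check  that  its  slope  is  close  to  zero.  The 
sketch  below  is  an  assumption-laden  illustration,  not  the  analysis 
used  in  this  study:  the  function  name  residual_trend  and  the  sample 
values  are  hypothetical,  and  the  check  covers  only  a  linear  trend, 
not  a  quadratic  one. 

```python
def residual_trend(predicted, residuals):
    """Least-squares slope and intercept of residuals against predictions."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(predicted, residuals))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical predictions (m/sec) and two hypothetical residual patterns.
sv_hat = [1490.0, 1500.0, 1510.0, 1520.0]
band = [0.5, -0.5, 0.5, -0.5]       # scatter with no systematic drift
drift = [-1.5, -0.5, 0.5, 1.5]      # residuals that grow with the prediction

band_slope, _ = residual_trend(sv_hat, band)
drift_slope, _ = residual_trend(sv_hat, drift)
print(f"band slope:  {band_slope:.3f}")
print(f"drift slope: {drift_slope:.3f}")
```

A  slope  that  is  small  relative  to  the  scatter  of  the  residuals 
is  consistent  with  a  horizontal  band;  a  clearly  nonzero  slope 
points  to  a  linear  effect  left  out  of  the  model. 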

The  regression  sound  velocity  predictions  obtained 
by  using  predicted  salinities  and  temperatures  are  not  as 
good  as  might  be  desired  or  needed  for  use  in  scientific 
work.  Distribution  5  of  Table  II  shows  528  cases  where  the 
regression  sound  velocity  equation  predicted  values  30  m/sec 

...  previous  runs.  In  addition,  the  residuals  were  quite  stable. 
These  results  substantiated  the  thought  that  models  3,  4, 
and  5  would  produce  acceptable  results  if  the  bad  data  were 
removed.  Based  on  these  results,  it  appears  feasible  that 
the  need  for  on-location  observations  of  salinity  and  temp¬ 
erature  might  be  eliminated  in  the  future. 

In  future  work  on  this  topic,  some  data  screening  de¬ 
vice  should  be  implemented  to  filter  out  obvious  errors  be¬ 
fore  the  final  prediction  equations,  particularly  for  salinity 
and  temperature,  are  developed.  This  would  improve  the  pre¬ 
dictive  ability  of  the  salinity  and  temperature  equations 
and  thus  improve  the  regression  sound  velocity  predictions. 

One  such  data  screening  device,  which  might  be  used  in 
future  investigations,  is  suggested  by  Anderson  [9].  He 
proposes  that  a  regression  equation  be  fit  to  all  raw  data 
available,  as  was  done  in  this  study.  The  residuals  (observed  - 
predicted)  would  then  be  examined.  If  a  residual  lies  within  ±2 
standard  deviations  of  the  mean,  that  data  point  is  retained  for 
further  analyses;  if  not,  that  data  point  is  eliminated 
from  further  consideration.  A  regression  equation  is  then 
fit  to  the  remaining  data.  This  procedure  has  the  facility 
of  immediately  identifying  erroneous  data  or  gross 
instrument  error. 
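
The  fit,  screen,  and  refit  sequence  can  be  sketched  as  follows. 
This  is  a  simplified  illustration  under  stated  assumptions:  it  uses 
a  single  independent  variable  rather  than  the  stepwise  multiple 
regression  of  this  study,  the  function  names  fit_line  and 
screen_and_refit  are  hypothetical,  and  the  depth  and  temperature 
values  are  invented  for  the  example. 

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def screen_and_refit(xs, ys, k=2.0):
    """Fit, drop points whose residual is beyond k standard deviations, refit."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    mean_r = statistics.fmean(residuals)
    sd_r = statistics.stdev(residuals)
    kept = [(x, y) for x, y, r in zip(xs, ys, residuals)
            if abs(r - mean_r) <= k * sd_r]
    kept_x = [x for x, _ in kept]
    kept_y = [y for _, y in kept]
    return fit_line(kept_x, kept_y), kept


# Hypothetical data: temperature falls 0.5 degrees per depth unit,
# with one gross error inserted at depth 5.
depth = [float(d) for d in range(10)]
temp = [20.0 - 0.5 * d for d in depth]
temp[5] = 30.0

(slope, intercept), kept = screen_and_refit(depth, temp)
print(f"kept {len(kept)} of {len(depth)} points")
print(f"refit: temp = {intercept:.2f} + ({slope:.2f}) * depth")
```

Only  one  screening  pass  is  made  here,  matching  the  description 
above;  whether  repeated  passes  would  help  is  not  addressed  in 
this  study. 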

An  alternative  to  the  above  data  screening  procedure 
would  be  to  compute  the  mean  and  standard  deviation  of  the 
data  set  in  question,  then  eliminate  all  data  that  fall 
outside  ±2  or  ±3  standard  deviations  from  the  mean.  A 
regression  equation  could  then  be  fit  to  the  remaining  data. 
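
A  minimal  sketch  of  this  alternative  is  given  below.  The  function 
name  screen_by_sigma  and  the  salinity  values  are  assumptions  made 
for  illustration  and  do  not  come  from  the  data  used  in  this  study. 

```python
import statistics


def screen_by_sigma(values, k=2.0):
    """Keep only values within k sample standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= k * sd]


# Hypothetical salinity readings with one suspect value (39.0).
salinity = [35.1, 35.0, 35.2, 34.9, 35.1, 35.0, 35.2, 34.9, 35.0, 39.0]

kept_2 = screen_by_sigma(salinity, k=2.0)
kept_3 = screen_by_sigma(salinity, k=3.0)
print(f"k=2 keeps {len(kept_2)} of {len(salinity)} readings")
print(f"k=3 keeps {len(kept_3)} of {len(salinity)} readings")
```

In  this  small  example  the  suspect  reading  is  removed  at  2  standard 
deviations  but  survives  at  3,  because  the  outlier  itself  inflates 
the  standard  deviation.  The  choice  of  threshold  therefore  matters 
for  small  data  sets. 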

A  number  of  2°  by  2°  and  4°  by  4°  squares  adjacent  to 
the  area  36°  -  40°N  latitude  and  68°  -  72°W  longitude  were 
examined.  The  resulting  prediction  equations  were  quite 
similar  in  form  to  those  determined  for  the  original  square. 
However,  the  coefficients  of  the  independent  variables  were 
obviously  somewhat  different.  In  general,  the  prediction 
equations  for  salinity,  temperature  and  sound  velocity  in 
the  surrounding  areas  produced  results  that  were  quite  good. 

For  future  study  on  this  topic,  analysis  similar  to 
that  discussed  in  Chapter  III  of  this  study  should  be 
performed  on  several  additional  2°  x  2°  or  4°  x  4°  squares 
surrounding  the  area  36°  -  40°N  latitude  and  68°  -  72°W  longitude. 
Based  on  the  results  from  a  number  of  surrounding  squares 
that  were  examined  in  this  study,  the  resulting  regression 
equations  should  be  similar  to  the  ones  resulting  from  models 
3,  4,  and  5  described  in  Chapter  III.  These  regression 
equations  could  then  be  examined  for  patterns,  and  generalized 
equations  for  salinity,  temperature,  and  sound  velocity 
might  become  evident  which  could  be  applicable  to  a 
much  expanded  oceanographic  area. 

Physical  characteristics  of  the  oceanographic 
environment  are  difficult  to  represent  with  rigid  equations,  as  is 
possible  in  many  areas  of  the  physical  sciences,  because  of 
their  dynamic  character.  The  laws  of  nature,  however,  are 
characterized  by  certain  patterns  and  this  environment  will 
eventually  be  represented  too. 